iterative optimization
Surfing: Iterative Optimization Over Incrementally Trained Deep Networks
We investigate a sequential optimization procedure to minimize the empirical risk functional $f_{\hat\theta}(x) = \frac{1}{2}\|G_{\hat\theta}(x) - y\|^2$ for certain families of deep networks $G_{\theta}(x)$. The approach is to optimize a sequence of objective functions that use network parameters obtained during different stages of the training process. When initialized with random parameters $\theta_0$, we show that the objective $f_{\theta_0}(x)$ is ``nice'' and easy to optimize with gradient descent. As learning is carried out, we obtain a sequence of generative networks $x \mapsto G_{\theta_t}(x)$ and associated risk functions $f_{\theta_t}(x)$, where $t$ indicates a stage of stochastic gradient descent during training. Since the parameters of the network do not change by very much in each step, the surface evolves slowly and can be incrementally optimized. The algorithm is formalized and analyzed for a family of expansive networks. We call the procedure {\it surfing} since it rides along the peak of the evolving (negative) empirical risk function, starting from a smooth surface at the beginning of learning and ending with a wavy nonconvex surface after learning is complete. Experiments show how surfing can be used to find the global optimum and for compressed sensing even when direct gradient descent on the final learned network fails.
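The warm-start idea in the abstract can be illustrated with a minimal NumPy sketch. Everything here is a toy stand-in: a two-layer ReLU generator, and a linear interpolation between random and final weights in place of actual SGD training snapshots; `descend` is plain gradient descent on $f_\theta(x) = \frac{1}{2}\|G_\theta(x) - y\|^2$ with the minimizer of each surface carried over as the starting point for the next.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 5, 20, 40                      # latent, hidden, output dimensions

def G(theta, x):
    W1, W2 = theta
    return W2 @ np.maximum(W1 @ x, 0.0)  # toy two-layer ReLU generator

def grad_x(theta, x, y):
    # Gradient of 0.5 * ||G_theta(x) - y||^2 with respect to x.
    W1, W2 = theta
    pre = W1 @ x
    r = G(theta, x) - y
    return W1.T @ ((W2.T @ r) * (pre > 0))

def descend(theta, x, y, lr=0.01, steps=300):
    for _ in range(steps):
        x = x - lr * grad_x(theta, x, y)
    return x

def init(shape, scale):
    return rng.normal(size=shape) * scale

# Stand-in "training trajectory": linearly interpolate from random initial
# weights to final weights (the paper rides actual SGD snapshots instead).
theta0 = (init((h, d), d ** -0.5), init((n, h), h ** -0.5))
theta1 = (init((h, d), d ** -0.5), init((n, h), h ** -0.5))
path = [tuple(a + t * (b - a) for a, b in zip(theta0, theta1))
        for t in np.linspace(0.0, 1.0, 10)]

x_true = rng.normal(size=d)
y = G(path[-1], x_true)                  # observations from the final network

x_init = rng.normal(size=d)
x = x_init
for theta in path:                       # surf: warm-start on each surface
    x = descend(theta, x, y)

risk_init = 0.5 * np.sum((G(path[-1], x_init) - y) ** 2)
risk = 0.5 * np.sum((G(path[-1], x) - y) ** 2)
print(risk_init, risk)
```

The first surface, with random weights, is the "nice" one; each subsequent descent starts from the previous surface's minimizer rather than from scratch, which is the entire point of the procedure.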
A Robust and Non-Iterative Tensor Decomposition Method with Automatic Thresholding
Hasegawa, Hiroki, Okada, Yukihiko
Recent advances in IoT and biometric sensing technologies have led to the generation of massive and high-dimensional tensor data, yet achieving accurate and efficient low-rank approximation remains a major challenge. Most existing tensor decomposition methods require predefined ranks and iterative optimization, resulting in high computational costs and dependence on analyst expertise. This study proposes a novel tensor low-rank approximation method that eliminates both prior rank specification and iterative optimization. The method applies statistical singular value hard thresholding to each mode-wise unfolded matrix to automatically extract statistically significant components, effectively reducing noise while preserving the intrinsic structure. Theoretically, the optimal thresholds for each mode are derived from the asymptotic properties of the Marcenko-Pastur distribution. Simulation experiments demonstrate that the proposed method outperforms conventional approaches (HOSVD, HOOI, and Tucker-L2E) in both estimation accuracy and computational efficiency. These results indicate that the proposed approach provides a theoretically grounded, fully automatic, and non-iterative framework for tensor decomposition.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
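A minimal NumPy sketch of the non-iterative pipeline described above: unfold the tensor along each mode, hard-threshold the singular values to pick a rank automatically, then form a truncated Tucker-style reconstruction. The cutoff `2.858 * median(s)` is a Gavish-Donoho-style stand-in for the paper's Marcenko-Pastur-derived thresholds, and the helper names are illustrative, not the authors' API.

```python
import numpy as np

def unfold(T, mode):
    # Mode-wise unfolding: move the chosen mode to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, A, mode):
    # Multiply tensor T by matrix A along the given mode.
    return np.moveaxis(np.tensordot(A, np.moveaxis(T, mode, 0), axes=(1, 0)),
                       0, mode)

def auto_lowrank(T):
    """Non-iterative low-rank approximation with automatic per-mode ranks."""
    factors = []
    for mode in range(T.ndim):
        U, s, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        tau = 2.858 * np.median(s)       # MP-motivated hard threshold (stand-in)
        r = max(1, int(np.sum(s > tau))) # keep statistically significant components
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):   # project onto retained subspaces
        core = mode_product(core, U.T, mode)
    That = core
    for mode, U in enumerate(factors):   # map the core back to the full space
        That = mode_product(That, U, mode)
    return That

rng = np.random.default_rng(0)
# Rank-(2,2,2) signal plus Gaussian noise.
A, B, C = (rng.normal(size=(30, 2)) for _ in range(3))
signal = np.einsum('ir,jr,kr->ijk', A, B, C)
noisy = signal + 0.1 * rng.normal(size=signal.shape)
That = auto_lowrank(noisy)
rel = np.linalg.norm(That - signal) / np.linalg.norm(signal)
print(rel)
```

No rank is specified and no alternating refinement is run; each mode is handled with a single SVD, which is what makes the approach non-iterative.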
VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration
Zhang, Hang, Zhang, Yuxi, Wang, Jiazheng, Chen, Xiang, Hu, Renjiu, Tian, Xin, Li, Gaolei, Liu, Min
Recent developments in neural networks have improved deformable image registration (DIR) by amortizing iterative optimization, enabling fast and accurate DIR results. However, learning-based methods often face challenges with limited training data and large deformations, and tend to underperform iterative approaches when label supervision is unavailable. While iterative methods can achieve higher accuracy in such scenarios, they are considerably slower than learning-based methods. To address these limitations, we propose VoxelOpt, a discrete optimization-based DIR framework that combines the strengths of learning-based and iterative methods to achieve a better balance between registration accuracy and runtime. VoxelOpt uses displacement entropy from local cost volumes to measure displacement signal strength at each voxel, which differs from earlier approaches in three key aspects. First, it introduces voxel-wise adaptive message passing, where voxels with lower entropy receive less influence from their neighbors. Second, it employs a multi-level image pyramid with 27-neighbor cost volumes at each level, avoiding exponential complexity growth. Third, it replaces hand-crafted features or contrastive learning with a pretrained foundational segmentation model for feature extraction. In abdominal CT registration, these changes allow VoxelOpt to outperform leading iterative methods in both efficiency and accuracy, while matching state-of-the-art learning-based methods trained with label supervision.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States (0.04)
- (3 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
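The entropy-gated update in the first contribution can be sketched in a few lines of NumPy. This is an illustrative simplification, not VoxelOpt's actual update rule: it uses a 6-neighbor average in place of the paper's message passing, and the `lam` weight and softmax-based entropy are assumptions.

```python
import numpy as np

def softmax(c, axis=-1):
    e = np.exp(c - c.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def neighbor_mean(C):
    # 6-neighbour average along the three spatial axes (a simplification
    # of proper message passing; wraps at the borders via np.roll).
    acc = np.zeros_like(C)
    for ax in range(3):
        acc += np.roll(C, 1, axis=ax) + np.roll(C, -1, axis=ax)
    return acc / 6.0

def adaptive_message_pass(cost, lam=1.0):
    # cost: (D, H, W, K) matching cost over K candidate displacements.
    p = softmax(-cost)                   # low cost -> high probability
    K = cost.shape[-1]
    ent = -(p * np.log(p + 1e-12)).sum(-1) / np.log(K)  # normalized entropy
    # Confident voxels (low entropy) receive less influence from neighbours.
    return cost + lam * ent[..., None] * neighbor_mean(cost)

rng = np.random.default_rng(0)
cost = rng.normal(size=(4, 4, 4, 27))    # 27-neighbor cost volume on a tiny grid
cost[0, 0, 0] = 0.0
cost[0, 0, 0, 0] = -100.0                # near-certain displacement at this voxel
out = adaptive_message_pass(cost)
```

At the peaked voxel the normalized entropy is essentially zero, so its costs pass through almost unchanged, while ambiguous voxels are smoothed toward their neighbors.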
From Few to Many: Self-Improving Many-Shot Reasoners Through Iterative Optimization and Generation
Wan, Xingchen, Zhou, Han, Sun, Ruoxi, Nakhost, Hootan, Jiang, Ke, Arık, Sercan Ö.
Recent advances in long-context large language models (LLMs) have led to the emerging paradigm of many-shot in-context learning (ICL), where it has been observed that scaling to many more demonstration examples than the conventional few-shot setup can yield performance benefits. However, despite its promise, it is unclear which aspects dominate these benefits and whether simply scaling to more examples is the most effective way of improving many-shot ICL. In this work, we first analyze the factors driving many-shot ICL and find that 1) many-shot performance can often be attributed to a few disproportionately influential examples, and 2) identifying such influential examples ("optimize") and using them as demonstrations to regenerate new examples ("generate") can lead to further improvements. Inspired by these findings, we propose BRIDGE, an algorithm that alternates between the optimize step, which uses Bayesian optimization to discover the influential sets of examples, and the generate step, which reuses this set to expand the reasoning paths of the examples back to the many-shot regime automatically. On Gemini, Claude, and Mistral LLMs of different sizes, we show that BRIDGE leads to significant improvements across a diverse set of tasks, including symbolic reasoning, numerical reasoning, and code generation.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Austria > Vienna (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
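The optimize/generate alternation can be sketched as a plain Python loop. Everything below is a hedged stand-in: greedy leave-one-out scoring replaces the paper's Bayesian optimization over subsets, and `evaluate` and `expand` are hypothetical callables that a real system would back with LLM evaluations and generations.

```python
import random

def optimize_step(examples, evaluate, k):
    # Score each demonstration by its leave-one-out marginal contribution
    # (a greedy stand-in for Bayesian optimization over example subsets).
    base = evaluate(examples)
    scores = {i: base - evaluate(examples[:i] + examples[i + 1:])
              for i in range(len(examples))}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [examples[i] for i in top]

def generate_step(influential, expand, n_total):
    # Reuse the influential set to regenerate a many-shot demonstration pool.
    out = list(influential)
    while len(out) < n_total:
        out.append(expand(random.choice(influential)))
    return out

def bridge(examples, evaluate, expand, rounds=2, k=2, n_total=8):
    for _ in range(rounds):
        influential = optimize_step(examples, evaluate, k)   # "optimize"
        examples = generate_step(influential, expand, n_total)  # "generate"
    return examples

# Toy usage with placeholder metric and generator.
random.seed(0)
pool = ["good: 2+2=4", "bad: 2+2=5", "good: 3*3=9", "bad: 1+1=3", "bad: 5-2=1"]
evaluate = lambda ex: sum("good" in e for e in ex) / len(ex)  # toy proxy metric
expand = lambda e: e + " (variant)"                           # toy regeneration
result = bridge(pool, evaluate, expand, rounds=2, k=2, n_total=8)
```

With the toy metric, the leave-one-out scores single out the helpful demonstrations, and regeneration expands only those back to the many-shot size.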
Reviews: Surfing: Iterative Optimization Over Incrementally Trained Deep Networks
The paper proposes a new method for provably fitting deep generative models to observations, a highly non-convex optimization problem. Instead of directly searching for the latent code that explains the measurements, as proposed by Bora et al., this paper starts with a different deep generative model that has random weights, for which Hand et al. showed that gradient descent provably works. The weights of this generator are then incrementally modified to approach the true generator, each time using the previous optimum as the starting point. This sequence of models can be taken as snapshots of the model during the training process. The main result is a theory showing that warm-started non-convex optimization in expansive Gaussian networks yields successful recovery.